Abstract

The construction of adaptive nonparametric procedures by means of wavelet thresh- 
olding techniques is now a classical topic in modern mathematical statistics. In this 
paper, we extend this framework to the analysis of nonparametric regression on sections 
of spin fiber bundles defined on the sphere. This can be viewed as a regression problem 
where the function to be estimated takes as its values algebraic curves (for instance, 
ellipses) rather than scalars, as usual. The problem is motivated by many important 
astrophysical applications, concerning for instance the analysis of the weak gravitational 
lensing effect, i.e. the distortion effect of gravity on the images of distant galaxies. We 
propose a thresholding procedure based upon the (mixed) spin needlets construction
recently advocated by Geller and Marinucci (2008, 2010) and Geller et al. (2008, 2009),
and we investigate its rates of convergence and its adaptive properties over spin
Besov balls.
1 Introduction



Over the last two decades, wavelet techniques have become a well-established tool for the
analysis of nonparametric statistical problems, especially in the framework of minimax
estimation. The seminal contribution in this area was provided by Donoho et al. in [16], where
it was proved that nonlinear wavelet estimators based on thresholding techniques achieve nearly
optimal minimax rates (up to logarithmic terms) for a wide class of nonparametric estimation
problems involving unknown density and regression functions. The theory has been developed
enormously ever since; we refer to [33] for a textbook reference.

The bulk of this literature has focussed on estimation in standard Euclidean frameworks,
such as $\mathbb{R}$ or $\mathbb{R}^n$. More recently, applications from various scientific
fields have drawn considerable attention to more general settings, such as spherical data or
more general manifolds (see [T]). This area has recently experienced a remarkable amount of
activity, both from the purely mathematical point of view and in terms of applications to
empirical data. In particular, a highly successful construction of a second-generation wavelet
system on the sphere (the so-called needlets) has been introduced by [51], [52]; this approach
has been extended to more general manifolds and unbounded support in the harmonic domain by
[23], [21], [25]. The investigation of the stochastic properties of needlets when implemented on
spherical random fields is due to [3], [1], [13], [H] and [19], where applications to several
statistical procedures are also considered. These procedures have been mainly motivated by
issues arising in Cosmology and Astrophysics, and indeed several applications to experimental
data have already been implemented: for instance, to data from NASA's WMAP satellite mission,
focussing on the so-called Cosmic Microwave Background radiation, see [55], [18], [56], [57],
[58], [H], [59], [11] and [31]. These applications, however, have not focussed on thresholding
estimates and minimax results, but rather on random field issues, such as angular power
spectrum estimation, higher-order spectra, testing for Gaussianity and isotropy, and several
others (see also [17], [T3]).

More recently, a few papers have focussed on the use of needlets to develop estimators within
the thresholding paradigm, in the framework of directional data. The pioneering contribution
here is due to [5], see also [37], [38], [36]; applications to astrophysical data are still
under way. Earlier results on minimax estimators for spherical data, outside the needlets
approach, are due to Kim and coauthors (see [10], [32], WA).

Another important generalization of the needlet approach has been recently advocated by
[21]; applications to statistics can be found in [20]. This development is again motivated by
Cosmology and Astrophysics. In particular, we noted above that some extremely influential
satellite missions from NASA and ESA (WMAP and Planck, respectively) are currently
collecting data on the so-called Cosmic Microwave Background (CMB) radiation, which can be
viewed as the realization of a scalar, isotropic, mean-square continuous spherical random field
(see for instance [?] for a review). These same experiments are also collecting data on a much
more elusive cosmological feature, the so-called polarization of the CMB. The latter can be
loosely described as observations on random ellipses living on the tangent planes at each
location on the celestial sphere. Mathematically, this can be expressed by defining random
sections of spin fiber bundles, a generalization of the notion of scalar random fields (see
[21], [20], [22] and Section 2 below for more details and discussion). Quite interestingly,
exactly the same mathematical framework describes the so-called weak gravitational lensing
induced on the observed shape of distant galaxies by clusters of matter. This is again a major
issue in the analysis of astrophysical data (see for instance [7], [ST] and the references
therein): huge amounts of observational data are expected in the next decade, by means of
satellite missions in preparation such as Euclid.

The application of spin needlets to CMB polarization data is discussed in [JjJ]. The
characterization of spin Besov spaces by means of needlet decompositions is discussed by [2]
and [22]; the latter reference also introduces an alternative construction for needlets on spin
fiber bundles (so-called mixed needlets), and provides its analytical and statistical properties.
Our purpose in this article is to exploit these results and classical techniques to introduce
and develop spin nonparametric regression, with a view to applications to polarization and
weak lensing data. In particular, we investigate the properties of nonlinear hard thresholding
estimates, and we establish rates of convergence over a wide class of $L^p_s$ norms and spin
Besov spaces (see again [2], [22] and the sections to follow for more detailed definitions). More
precisely, we shall assume that we have observations on independent pairs of random variables,
respectively scalar and spin, $(X_i, Y_{i;s})$, $i = 1, \dots, n$, $X_i \in \mathbb{S}^2$; we
view $\{X_i\}$ as uniform random locations on the sphere, which correspond for instance to the
positions of observed galaxies. We shall then be concerned with the regression model:

$$Y_{i;s} = F_s(X_i) + \varepsilon_{i;s}, \qquad (1)$$

where $F_s(\cdot)$ is an unknown section of a spin fiber bundle; for instance, for $s = 2$,
$F_s$ can be taken to represent the geometric effect of the gravitational shear. We assume that
this section belongs to $L^p_s(\mathbb{S}^2)$, the space of the spin $s$, $p$-integrable
sections on the sphere. On the other hand, we assume the $\varepsilon_{i;s}$ are i.i.d. spin
random variables, which can be viewed as an observational error (to be interpreted, for
instance, as the intrinsic shape of the galaxy). We are then led to nonparametric estimation
over an unknown functional class, and we aim at procedures which are robust (i.e. nearly
optimal) for a wide class of $L^p_s$ norms, $1 \le p \le \infty$.
To address this issue, and given the properties of (mixed) spin needlets established in [21],
[22], we follow a classical approach, as discussed for the classical case on $\mathbb{R}$ by
Donoho et al. ([16]), Härdle et al. ([33]), ([37]), and many other papers, see for instance
([8], [12], [31]) for some recent developments. In particular, as mentioned before, we introduce
thresholding estimates and establish convergence rates for the resulting nonlinear estimators.
We stress that we consider at the same time estimators based upon both spin constructions we
have mentioned before, i.e. pure and mixed spin needlets; the results with the two approaches
are identical. Sharp adaptation results for nonparametric regression on vector bundles were
recently established in an important paper by [3D]. These authors focus on the case $p = 2$, and
therefore exploit Fourier methods rather than wavelet thresholding. For $s = 1$, our method
can be viewed as a form of adaptive regression for vector fields, and in this sense it relates
also to recent work on filament estimation by [2B], [2H]. See also [BDJ for some recent work on
statistical analysis for tensor-valued data.

The plan of the paper is as follows. In Section 2 we review some background material on spin 
fiber bundles, while in Section 3 we recall the construction of spin and mixed spin needlets; for 
both sections we follow closely earlier references, in particular [21] and [22]. In Section 4 we 
review some crucial material on spin Besov spaces, as discussed earlier by [2] and [22]. Sections
5 and 6 contain the most important contributions of this paper, namely the presentation of
the thresholding procedure and the investigation of its asymptotic properties.

2 Spin functions 

2.1 Background and definitions 

The purpose of this Section is to review some background material on spin fiber bundles; our
presentation follows closely [21], [20], [22], to which we refer for more discussion and details.
The concept of a spin function was introduced in the sixties by Newman and Penrose in [53],
while working on gravitational radiation, see also [30], [17]. Writing in physicists' jargon,
they said that a function $\eta$ has an integer-valued spin weight $s$ (or, briefly, that $\eta$
is a spin $s$ quantity) if, whenever a tangent vector at a point $x \in \mathbb{S}^2$ is rotated
by an angle $\psi$ under a coordinate change, $\eta$ transforms as $\eta' = e^{is\psi}\eta$. This
same idea is formalized as follows by [21]. Let $U_I := \mathbb{S}^2 \setminus \{N, S\}$ be the
chart that covers the sphere with the North and the South poles subtracted: here we adopt the
usual angular coordinates $(\vartheta, \varphi)$, $\vartheta \in (0, \pi)$ and
$\varphi \in [-\pi, \pi)$. Define the rotated charts $U_R = R U_I$, where $R \in SO(3)$ (the
special group of rotations), and label the corresponding coordinates $(\vartheta_R, \varphi_R)$.
For any $x \in \mathbb{S}^2$, we can fix a "reference direction" in the tangent plane at $x$
(labelled as usual $T_x(\mathbb{S}^2)$) by considering $\rho_I(x) = \partial/\partial\varphi$,
the unit tangent vector in the direction of the circle where $\vartheta$ is constant and
$\varphi$ is increasing. For every $x$ belonging to the intersection between the charts
$U_R$ and $U_I$, we can uniquely measure the angle associated with a change of coordinates by
considering the angle between the reference vector in the chart $U_I$ and the reference vector
in the rotated chart, namely $\rho_R(x) = \partial/\partial\varphi_R$. More generally, given
$x \in \mathbb{S}^2$ and two charts $U_{R_1}$ and $U_{R_2}$ such that
$x \in U_{R_1} \cap U_{R_2}$, the angle between $U_{R_1}$ and $U_{R_2}$, $\psi_{x R_1 R_2}$, is
defined as the angle between $\rho_{R_1}(x)$ and $\rho_{R_2}(x)$, see [21], [20] for a
discussion on the orientation of this angle.

Fix now an open subset $G \subset \mathbb{S}^2$. The collection of functions
$\{F_R\}_{R \in SO(3)}$ is a spin $s$ function $F_s$ if and only if, for all
$R_1, R_2 \in SO(3)$ and all $x \in U_{R_1} \cap U_{R_2} \cap G$, we have:

$$F_{R_2}(x) = e^{is\psi_{x R_2 R_1}} F_{R_1}(x).$$

We write $F_s \in C^\infty_s(G)$ if, for every $R \in SO(3)$, the map $x \mapsto F_R(x)$ is
smooth. Note that for $s = 0$ we are back to the usual scalar functions.

From a differential geometry point of view, $C^\infty_s(G)$ is the space of sections over $G$
of the complex line bundle over the sphere $\mathbb{S}^2$ (see also [13], [IS] for more
discussion on this point of view). The functional spaces $L^p_s(\mathbb{S}^2)$ are then defined
by

$$F_s \in L^p_s(\mathbb{S}^2) \iff \|F_s\|_{L^p_s(\mathbb{S}^2)} = \left(\int_{\mathbb{S}^2} |F_s(x)|^p \, dx\right)^{1/p} < \infty.$$

Note that, while $F_s(x)$ is a section of the fiber bundle on $\mathbb{S}^2$, $|F_s(x)|$ is a
real-valued function on the sphere, because the modulus of $F_s$ does not depend on the choice
of the coordinate system: therefore the $L^p_s(\mathbb{S}^2)$ norm is well defined.

2.2 Spin Spherical Harmonics 

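We start by recalling the well-known expression for the spherical Laplacian
$\Delta_{\mathbb{S}^2}$,

$$\Delta_{\mathbb{S}^2} = \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2} + \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial}{\partial\vartheta}\right).$$

A complete orthonormal set of eigenfunctions for the spherical Laplacian is provided by the
family of spherical harmonics $\{Y_{lm}\}$, $l = 0, 1, 2, \dots$, $m = -l, \dots, l$:

$$\Delta_{\mathbb{S}^2} Y_{lm} = -l(l+1)Y_{lm}, \qquad \int_{\mathbb{S}^2} Y_{lm}(x)\overline{Y_{l'm'}(x)}\,dx = \delta_l^{l'}\delta_m^{m'},$$

where $\delta_a^b$ denotes the Kronecker delta function. In the spherical coordinates
$(\vartheta, \varphi)$, for $m \ge 0$,

$$Y_{lm}(\vartheta, \varphi) = \sqrt{\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}}\, P_{lm}(\cos\vartheta)\, e^{im\varphi}, \qquad P_{lm}(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_l(x),$$

where $P_l(x)$ denotes the Legendre polynomials, see for instance [102] for more analytic
expressions and discussion. Denoting by $\{\mathcal{H}_l\}$ the linear spaces spanned by the
spherical harmonics, the following decomposition holds (see for instance [1]):

$$L^2(\mathbb{S}^2) = \bigoplus_{l \ge 0} \mathcal{H}_l,$$

that is, in the $L^2$ sense, for all $f \in L^2(\mathbb{S}^2)$

$$f(x) = \sum_{lm} a_{lm} Y_{lm}(x), \qquad a_{lm} = \int_{\mathbb{S}^2} f(x)\overline{Y_{lm}(x)}\,dx.$$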

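It is possible to introduce spin spherical harmonics as the eigenfunctions of a second-order
differential operator which generalizes the spherical Laplacian (refer again to [64], [21] for
more details). To this aim, consider the (spin raising and spin lowering) operators $\eth$ and
$\bar\eth$, whose action on a spin function $F_s(\cdot)$ is provided by:

$$\eth F_s(\vartheta, \varphi) = -(\sin\vartheta)^{s}\left[\frac{\partial}{\partial\vartheta} + \frac{i}{\sin\vartheta}\frac{\partial}{\partial\varphi}\right](\sin\vartheta)^{-s}F_s(\vartheta, \varphi),$$

$$\bar\eth F_s(\vartheta, \varphi) = -(\sin\vartheta)^{-s}\left[\frac{\partial}{\partial\vartheta} - \frac{i}{\sin\vartheta}\frac{\partial}{\partial\varphi}\right](\sin\vartheta)^{s}F_s(\vartheta, \varphi).$$

It should be noted that $\eth$ transforms spin $s$ functions into spin $s + 1$ functions,
$\eth : C^\infty_s \to C^\infty_{s+1}$, while $\bar\eth$ transforms spin $s$ functions into spin
$s - 1$ functions, $\bar\eth : C^\infty_s \to C^\infty_{s-1}$, which justifies their names. The
previous expressions should be written more rigorously in terms of $\eth_R$, $\bar\eth_R$,
$\varphi_R$, $F_{s;R}$, because both the operators and the spin functions depend on the choice
of coordinates. More importantly, $\eth$, $\bar\eth$ can be used to define a differential
operator $\bar\eth\eth$, which can be viewed as a generalization of the scalar spherical
Laplacian; indeed

$$-\bar\eth\eth\, Y_{lm;s} = e_{ls} Y_{lm;s},$$

where $\{e_{ls}\}_{l = s, s+1, \dots} = \{(l - s)(l + s + 1)\}_{l = s, s+1, \dots}$ is the
associated sequence of eigenvalues and $\{Y_{lm;s}\}$, $l = s, s+1, \dots$;
$m = -l, \dots, l$ is the sequence of orthonormal spin spherical harmonics, which we define by

$$Y_{lm;s} := \sqrt{\frac{(l-s)!}{(l+s)!}}\,(\eth)^{s}\, Y_{lm} \quad \text{for } s > 0,$$

$$Y_{lm;s} := \sqrt{\frac{(l+s)!}{(l-s)!}}\,(-\bar\eth)^{-s}\, Y_{lm} \quad \text{for } s < 0.$$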
Again, as before, it should be noted that in the spin case the operators depend on the choice
of the coordinates, differently from the scalar case. As discussed by [2D], [IS], [IS], the spin
construction could be alternatively provided in terms of the so-called spin-weighted
representation of the special group of rotations $SO(3)$; indeed spin spherical harmonics can be
related to the so-called Wigner's matrices, see again [62], [63]. In particular, it is then
possible to show that the spin spherical harmonics are themselves an orthonormal system, i.e.
they satisfy

$$\int_{\mathbb{S}^2} Y_{lm;s}(x)\overline{Y_{l'm';s}(x)}\,dx = \int_0^{2\pi}\!\!\int_0^{\pi} Y_{lm;s}(\vartheta, \varphi)\overline{Y_{l'm';s}(\vartheta, \varphi)}\sin\vartheta\, d\vartheta\, d\varphi = \delta_l^{l'}\delta_m^{m'}.$$

As for the scalar case,

$$L^2_s(\mathbb{S}^2) = \bigoplus_{l = |s|}^{\infty} \mathcal{H}_{l;s}, \qquad \mathcal{H}_{l;s} := \mathrm{span}\{Y_{lm;s};\ m = -l, \dots, l\},$$

and the following representation holds

$$F_s(x) = \sum_{l}\sum_{m} a_{lm;s} Y_{lm;s}(x),$$

in the $L^2_s$ sense, i.e.

$$\lim_{L \to \infty}\int_{\mathbb{S}^2}\left|F_s(x) - \sum_{l=|s|}^{L}\sum_{m=-l}^{l} a_{lm;s}Y_{lm;s}(x)\right|^2 dx = 0.$$

Here, the spherical harmonics coefficients
$a_{lm;s} := \int_{\mathbb{S}^2} F_s(x)\overline{Y_{lm;s}(x)}\,dx$ are such that

$$a_{lm;s} = a_{lm;E} + i\,a_{lm;M},$$

where $\{a_{lm;E}\}$, $\{a_{lm;M}\}$ are the coefficients of two standard (scalar-valued)
spherical functions, which in the physical literature are labelled the electric and magnetic
components of the spin function $F_s$, see again [21], [22] for more discussion.

3 Spin and Mixed Needlets 
3.1 Definition 



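We start by recalling the definition of scalar needlets, which were introduced by [51] and
[52] as:

$$\psi_{jk}(x) = \sqrt{\lambda_{jk}}\sum_{l} b\!\left(\frac{l}{B^j}\right)\sum_{m=-l}^{l}\overline{Y_{lm}(\xi_{jk})}\,Y_{lm}(x);$$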
here $\{\xi_{jk}, \lambda_{jk}\}$ are a set of cubature points and weights ensuring that:

$$\sum_{k}\lambda_{jk}Y_{lm}(\xi_{jk})\overline{Y_{l'm'}(\xi_{jk})} = \int_{\mathbb{S}^2} Y_{lm}(x)\overline{Y_{l'm'}(x)}\,dx = \delta_l^{l'}\delta_m^{m'},$$

$b(\cdot)$ is a compactly supported $C^\infty$ function satisfying the partition of unity
property:

$$\sum_{j} b^2\!\left(\frac{l}{B^j}\right) = 1$$

for all $l \ge 1$, and $B > 1$ is a bandwidth parameter. For a fixed value of $B$, we denote by
$\{\mathcal{X}_j\}_{j \ge 0}$ the nested sequence of cubature points corresponding to the space
$\mathcal{K}_{[2B^{j+1}]}$, where $[\cdot]$ represents the integer part and
$\mathcal{K}_L = \bigoplus_{l=0}^{L}\mathcal{H}_l$ is the space spanned by spherical harmonics
up to order $L$. For each $j$, the cubature points are almost distributed as an
$\alpha_j$-net, with $\alpha_j := \kappa B^{-j}$, the coefficients $\{\lambda_{jk}\}$ are such
that $cB^{-2j} \le \lambda_{jk} \le CB^{-2j}$, with $c, C \in \mathbb{R}$, and
$N_j = \mathrm{card}\{\mathcal{X}_j\} \approx B^{2j}$, see for instance [1] for more details.
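A window with the partition of unity property can be obtained by a telescoping construction that is standard in the needlet literature. The Python sketch below is only an illustration and is not taken from this paper: the bump function, the midpoint quadrature, the scalar argument $l/B^j$ and all function names are assumptions made here.

```python
import math

def _bump(t):
    # Smooth bump function supported on (-1, 1).
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def _integrate(f, a, b, steps=2000):
    # Midpoint rule; accurate enough for an illustration.
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

_TOTAL = _integrate(_bump, -1.0, 1.0)

def _smooth_step(u):
    # Rises smoothly from 0 (u <= -1) to 1 (u >= 1).
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return _integrate(_bump, -1.0, u) / _TOTAL

def phi(t, B):
    # Equals 1 on [0, 1/B] and decreases smoothly to 0 on [1/B, 1].
    if t <= 1.0 / B:
        return 1.0
    if t >= 1.0:
        return 0.0
    return _smooth_step(1.0 - 2.0 * B / (B - 1.0) * (t - 1.0 / B))

def b_squared(xi, B):
    # b^2 is supported in [1/B, B]; the sum over j telescopes.
    return phi(xi / B, B) - phi(xi, B)

def partition_sum(l, B, j_max=60):
    # Sum of b^2(l / B^j) over j = 0, ..., j_max (assumes l <= B^j_max).
    return sum(b_squared(l / B**j, B) for j in range(j_max + 1))
```

For instance, `partition_sum(7, 2.0)` should return a value equal to 1 up to rounding, because consecutive terms cancel and only $\phi(l/B^{j_{\max}+1}) - \phi(l)$ survives.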

The construction of spin needlets (as provided by [21]) is formally similar to the scalar
case, although as we discuss below it entails deep differences in terms of the spaces involved.
Indeed, spin needlets are defined as follows:

$$\psi_{jk;s}(x) = \sqrt{\lambda_{jk}}\sum_{l} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l}\overline{Y_{lm;s}(\xi_{jk})}\,Y_{lm;s}(x), \qquad (2)$$

where $\{\lambda_{jk}, \xi_{jk}\}$ are, as before, cubature weights and cubature points, and
$b(\cdot) \in C^\infty$ is nonnegative, compactly supported in $[1/B, B]$ and satisfies the
partition of unity property. Note, however, that the mathematical meaning of (2) is rather
different from the scalar case; indeed $\psi_{jk;s}(x)$ is to be viewed as a spin $s$ function
with respect to rotations of the tangent plane $T_x$, and a spin $-s$ function with respect to
rotations of the tangent plane $T_{\xi_{jk}}$. Moreover, as $\overline{Y_{lm;s}(\xi_{jk})}$,
$Y_{lm;s}(x)$ live on two different tangent planes $T_{\xi_{jk}}$, $T_x$, the product
$\overline{Y_{lm;s}(\xi_{jk})}\,Y_{lm;s}(x)$ is not defined and the notation
$\overline{Y_{lm;s}(\xi_{jk})} \otimes Y_{lm;s}(x)$ would be more appropriate. As a
consequence, the spin needlet operator acts on spin $s$ functions to produce spin $s$
coefficients

$$\beta_{jk;s} = \langle F_s, \psi_{jk;s}\rangle = \int_{\mathbb{S}^2} F_s(x)\overline{\psi_{jk;s}(x)}\,dx = \sqrt{\lambda_{jk}}\sum_{lm} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)a_{lm;s}Y_{lm;s}(\xi_{jk}). \qquad (3)$$
Therefore, $\psi_{jk;s}$ induces the linear map (3) from spin $s$ quantities to spin $s$
wavelet coefficients $\beta_{jk;s}$, while in the scalar case ($s = 0$) needlets generate a
linear map from scalar quantities to scalar quantities. Indeed, if $u$ is a spin $s$ vector at
$\xi_{jk}$, $\psi_{jk;s}(x)\,u$ becomes a spin $s$ vector at $\xi_{jk}$, since the product of
spin $-s$ and spin $s$ vectors at a point $x$ is a well-defined complex number, independently
of the choice of coordinate system.

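To provide a clearer interpretation of the previous expression, recall the decomposition of
the functional space $L^2_s(\mathbb{S}^2) = \bigoplus_{l \ge |s|}\mathcal{H}_{l;s}$. We can
hence define the following operators on $\mathcal{H}_{l;s}$:

$$K_j(x, y) = \sum_{l} b^2\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l} Y_{lm;s}(x)\overline{Y_{lm;s}(y)},$$

$$\Lambda_j(x, y) = \sum_{l} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l} Y_{lm;s}(x)\overline{Y_{lm;s}(y)},$$

such that the reproducing kernel property holds:

$$\int_{\mathbb{S}^2}\Lambda_j(x, y)\Lambda_j(y, z)\,dy = K_j(x, z).$$

Spin needlets can be derived by discretizing this operator by using the reproducing kernel
property. In fact $\Lambda_j$ is such that:

$$z \mapsto \Lambda_j(x, z) \in \mathcal{K}_{[B^{2j+1}]},$$

and therefore:

$$z \mapsto \Lambda_j(x, z)\Lambda_j(z, y) \in \mathcal{K}_{[B^{4j+2}]}.$$

After discretization, we obtain:

$$K_j(x, y) = \sum_{\xi_{jk} \in \mathcal{K}_{[B^{4j+2}]}}\lambda_{jk}\Lambda_j(x, \xi_{jk})\Lambda_j(\xi_{jk}, y),$$

where we exploit the fact that the pairs $\{\lambda_{jk}, \xi_{jk}\}$ can be chosen to form
exact cubature points and weights ([2]). Then

$$K_j f(x) = \int_{\mathbb{S}^2} K_j(x, y)f(y)\,dy = \sum_{\xi_{jk} \in \mathcal{K}_{[B^{4j+2}]}}\lambda_{jk}\Lambda_j(x, \xi_{jk})\int_{\mathbb{S}^2}\Lambda_j(\xi_{jk}, y)f(y)\,dy = \sum_{k}\beta_{jk;s}\psi_{jk;s}(x),$$

where

$$\psi_{jk;s}(x) = \sqrt{\lambda_{jk}}\,\Lambda_j(x, \xi_{jk}), \qquad \beta_{jk;s} = \sqrt{\lambda_{jk}}\int_{\mathbb{S}^2}\Lambda_j(\xi_{jk}, y)f(y)\,dy.$$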

As a minor point, note that for the argument of the function $b(\cdot)$ we have used here the
square root of $e_{ls}$, the eigenvalue of the corresponding spin spherical harmonics, while in
the scalar case [51], [52] proposed to adopt $l$. However, it is trivial to observe that, for
fixed $s$:

$$\lim_{l \to \infty}\frac{\sqrt{e_{ls}}}{l} = \lim_{l \to \infty}\frac{\sqrt{(l - s)(l + s + 1)}}{l} = 1.$$

3.2 Some Properties 

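We report some important properties of spin needlets, very similar to those in the scalar case
(see [51], [52]). Indeed, from the previous discussion it follows easily that
$|\psi_{jk;s}(x)|$ is a well-defined scalar quantity. The following localization property is
hence well-defined (see [21]): for any $M \in \mathbb{N}$, there exists a constant $c_M > 0$
such that for every $x \in \mathbb{S}^2$:

$$|\psi_{jk;s}(x)| \le \frac{c_M B^j}{\left(1 + B^j\arccos(\langle\xi_{jk}, x\rangle)\right)^M}.$$

Let us recall from (3) that

$$\beta_{jk;s} = \int_{\mathbb{S}^2} F_s(x)\overline{\psi_{jk;s}(x)}\,dx = \sqrt{\lambda_{jk}}\sum_{l} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m} a_{lm;s}Y_{lm;s}(\xi_{jk}),$$

and the following reconstruction formula holds:

$$F_s(x) = \sum_{jk}\beta_{jk;s}\psi_{jk;s}(x).$$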
It is simple to check that the squared moduli $|\beta_{jk;s}|^2$ of the coefficients are scalar
quantities. In the following, we will need both the $L^2_s(\mathbb{S}^2)$ and the
$L^p_s(\mathbb{S}^2)$ norm of $\psi_{jk;s}$. Let us start by observing that:

$$\|\psi_{jk;s}\|^2_{L^2_s(\mathbb{S}^2)} = \lambda_{jk}\sum_{l} b^2\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l}\overline{Y_{lm;s}(\xi_{jk})}Y_{lm;s}(\xi_{jk})\int_{\mathbb{S}^2} Y_{lm;s}(x)\overline{Y_{lm;s}(x)}\,dx$$

$$= \lambda_{jk}\sum_{l=B^{j-1}}^{B^{j+1}} b^2\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l}\overline{Y_{lm;s}(\xi_{jk})}Y_{lm;s}(\xi_{jk}) =: \tau^2_{jk}.$$

As discussed by [1], [2], [21], there exist positive constants $c_1, c_2$ such that
$c_1 N_j^{-1} \le \lambda_{jk} \le c_2 N_j^{-1}$.

Throughout the rest of the paper, to simplify notation we shall assume to be dealing with
sections of line bundles such that $F_s = (I - P_s)F_s$, $P_s$ denoting the projection operator
on the $|s|$ spin spherical harmonics. In other words, the component at $l = s$ is assumed to
be null; from the point of view of motivating applications, this is a very reasonable
assumption, indeed for polarization or weak lensing experiments the so-called quadrupole term
$l = s = 2$ has no physical meaning. The situation is indeed analogous to the standard scalar
case, where the constant term $s = 0$ cannot even be measured by ongoing (so-called
differential) experiments. Under these circumstances, as shown in [2], spin needlets make up a
tight frame system, i.e. for all $F_s \in L^2_s(\mathbb{S}^2)$,

$$\|F_s\|^2_{L^2_s(\mathbb{S}^2)} = \sum_{jk}|\beta_{jk;s}|^2,$$

whence we have easily

$$\sum_{jk}\left|\langle\psi_{j_1k_1;s}, \psi_{jk;s}\rangle\right|^2 = \|\psi_{j_1k_1;s}\|^4_{L^2_s(\mathbb{S}^2)} + \sum_{(j,k) \ne (j_1,k_1)}\left|\langle\psi_{j_1k_1;s}, \psi_{jk;s}\rangle\right|^2 \le \|\psi_{j_1k_1;s}\|^2_{L^2_s(\mathbb{S}^2)},$$

whence

$$\|\psi_{jk;s}\|_{L^2_s(\mathbb{S}^2)} \le 1.$$

More generally, it is shown in [2], [22] that for all $1 \le p \le \infty$, there exist
positive constants $c_p$, $C_p$ such that

$$c_p B^{2j\left(\frac{1}{2} - \frac{1}{p}\right)} \le \|\psi_{jk;s}\|_{L^p_s(\mathbb{S}^2)} = \left(\int_{\mathbb{S}^2}|\psi_{jk;s}(x)|^p\,dx\right)^{1/p} \le C_p B^{2j\left(\frac{1}{2} - \frac{1}{p}\right)}. \qquad (4)$$
3.3 Mixed Needlets and their properties

Mixed Needlets were introduced in [22]; they are defined as

$$\psi_{jk;sM}(x) = \sqrt{\lambda_{jk}}\sum_{l \ge |s|} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m=-l}^{l} Y_{lm;s}(x)\overline{Y_{lm}(\xi_{jk})},$$

with corresponding needlet coefficients

$$\beta_{jk;sM} = \int_{\mathbb{S}^2} F_s(x)\overline{\psi_{jk;sM}(x)}\,dx.$$

Mixed needlets form a tight frame system, with the same set of cubature points and weights
as for the scalar case, $\{\xi_{jk}, \lambda_{jk}\}$. When $F_s \in L^2_s(\mathbb{S}^2)$, we
have also

$$\beta_{jk;sM} = \sqrt{\lambda_{jk}}\sum_{l \ge |s|} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m} a_{lm;s}Y_{lm}(\xi_{jk}).$$

It should be noted that the coefficients $\{\beta_{jk;sM}\}$ are scalar, complex-valued random
variables; indeed for square integrable sections we have

$$\beta_{jk;sM} = \sqrt{\lambda_{jk}}\sum_{l \ge |s|} b\!\left(\frac{\sqrt{e_{ls}}}{B^j}\right)\sum_{m}\{a_{lm;E} + i\,a_{lm;M}\}Y_{lm}(\xi_{jk}) =: \beta_{jk;E} + i\,\beta_{jk;M},$$

where $\beta_{jk;E}$, $\beta_{jk;M}$ could be viewed as the scalar needlet coefficients of
standard square integrable functions on the sphere. For general
$F_s \in L^p_s(\mathbb{S}^2)$ the reconstruction formula holds, in the $L^p_s$ sense:

$$F_s(x) = \sum_{jk}\beta_{jk;sM}\psi_{jk;sM}(x).$$

Other properties of mixed needlets are analogous to those for the pure spin construction.
In particular, note that scalar and pure spin needlets are both constructed by a convolution
of a smooth function $b(\cdot)$ with projection operators such as, for instance,
$\sum_m Y_{lm}(x)\overline{Y_{lm}(y)}$. On the other hand, mixed needlets are built by
convolving $b(\cdot)$ with $\sum_m Y_{lm}(x)\overline{Y_{lm;s}(y)}$, which is not a projection
operator (indeed $\sum_m Y_{lm}(x)\overline{Y_{lm;s}(x)} = 0$). It comes therefore to some
extent as a surprise that mixed needlets do indeed enjoy localization properties; indeed we
have (see again [22]): for each $M > 0$ there exists a constant $c_M$ such that:

$$|\psi_{jk;sM}(x)| \le \frac{c_M B^j}{\left(1 + B^j\arccos(\langle\xi_{jk}, x\rangle)\right)^M}.$$

Building upon this localization property, it is indeed possible to establish the following
bounds (see [22] for more details):

$$c_1 B^{2j\left(\frac{1}{2} - \frac{1}{p}\right)} \le \|\psi_{jk;sM}\|_{L^p_s(\mathbb{S}^2)} \le c_2 B^{2j\left(\frac{1}{2} - \frac{1}{p}\right)}, \qquad c_1, c_2 > 0. \qquad (5)$$



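These constraints on the $L^p_s$ norms will have the greatest importance for our results to
follow. Also, for positive constants $c_3, c_4$ and arbitrary coefficients $\lambda_k$ we have

$$c_3\sum_k|\lambda_k|^p\|\psi_{jk;sM}\|^p_{L^p_s(\mathbb{S}^2)} \le \left\|\sum_k\lambda_k\psi_{jk;sM}\right\|^p_{L^p_s(\mathbb{S}^2)} \le c_4\sum_k|\lambda_k|^p\|\psi_{jk;sM}\|^p_{L^p_s(\mathbb{S}^2)}. \qquad (6)$$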



Remark 1 While the mathematical construction and the properties that can be developed for
mixed needlets are very similar to the spin case, there is a very relevant difference between
these approaches that will be very important for our purposes. As we have already seen,
$\psi_{jk;s}$ is formed by a tensor product of two terms belonging to two different spaces of
spin $-s$ and $s$, such that $\beta_{jk;s}$ belongs to the spin $s$ space. On the other hand,
$\psi_{jk;sM}$ induces a linear map from a spin $s$ vector at $\xi$ to a scalar (spin 0)
quantity, such that for a spin $s$ quantity $u$, the product
$\overline{\psi_{jk;sM}} \cdot u$ is always a scalar quantity.



4 Spin Besov spaces 

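Our aim in this Section is to recall the definition of spin Besov spaces in terms of
approximation properties. These definitions and their characterizations were provided by [2],
[22], to which we refer for further details and discussion. Define first,

$$G_k(F_s, \pi) = \inf_{H_s \in \mathcal{H}_{k;s}}\|F_s - H_s\|_{L^\pi_s(\mathbb{S}^2)},$$

i.e. the approximation error when replacing $F_s$ by an element in $\mathcal{H}_{k;s}$. Then
the Besov spin space $B^r_{\pi q;s}$ is defined as the space of functions such that
$F_s \in L^\pi_s(\mathbb{S}^2)$ and

$$\left(\sum_{k=0}^{\infty}\frac{1}{k}\left(k^r G_k(F_s, \pi)\right)^q\right)^{1/q} < \infty.$$

As usual, the last condition can be easily shown to be equivalent to

$$\left(\sum_{j=0}^{\infty}\left(B^{jr}G_{B^j}(F_s, \pi)\right)^q\right)^{1/q} < \infty.$$

Moreover, $F_s \in B^r_{\pi q;s}$ if and only if, for every $j = 1, 2, \dots$

$$\left(\sum_k\left(|\beta_{jk;s}|\,\|\psi_{jk;s}\|_{L^\pi_s(\mathbb{S}^2)}\right)^\pi\right)^{1/\pi} = \varepsilon_j B^{-jr},$$

where $\varepsilon_j \in \ell_q$ and $B > 1$. By defining the Besov norm as follows,

$$\|F_s\|_{B^r_{\pi q;s}} = \begin{cases}\|F_s\|_{L^\pi_s(\mathbb{S}^2)} + \left\|\left(B^{j\left(r + 1 - \frac{2}{\pi}\right)}\|(\beta_{jk;s})_k\|_{\ell_\pi}\right)_j\right\|_{\ell_q}, & \text{if } q < \infty,\\[4pt] \|F_s\|_{L^\pi_s(\mathbb{S}^2)} + \sup_j B^{j\left(r + 1 - \frac{2}{\pi}\right)}\|(\beta_{jk;s})_k\|_{\ell_\pi}, & \text{if } q = \infty,\end{cases}$$

we obtain that, if $\max(0, 1/\pi - 1/q) < r$ and $\pi, q > 1$, then

$$F_s \in B^r_{\pi q;s} \iff \|F_s\|_{B^r_{\pi q;s}} < \infty.$$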

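Besov spaces are characterized by some convenient embeddings, which (as always in this
literature) will play a crucial role in our proofs to follow. More precisely, we have that, for
$\pi_1 \le \pi_2$, $q_1 \le q_2$

$$B^r_{\pi q_1;s} \subset B^r_{\pi q_2;s}, \qquad B^r_{\pi_2 q;s} \subset B^r_{\pi_1 q;s}, \qquad B^r_{\pi_1 q;s} \subset B^{\,r - 2\left(\frac{1}{\pi_1} - \frac{1}{\pi_2}\right)}_{\pi_2 q;s}. \qquad (7)$$

The proof of (7) is exactly the same as for the scalar case, see [5]. In particular

$$B^r_{\pi_1 q;s} \subset B^{\,r - \frac{2}{\pi_1}}_{\infty\infty;s}, \qquad \sup_k|\beta_{jk;s}|\,\|\psi_{jk;s}\|_{L^\infty_s(\mathbb{S}^2)} = \varepsilon_j B^{-j\left(r - \frac{2}{\pi_1}\right)}$$

$$\Longrightarrow\ B^{j\left(r + 1 - \frac{2}{\pi_1}\right)}\sup_k|\beta_{jk;s}| < \infty\ \Longrightarrow\ B^{j}\sup_k|\beta_{jk;s}| < \infty.$$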



5 Nonparametric Regression on Spin Fiber Bundles 

5.1 The Regression Model 

We start by recalling the regression formula (1):

$$Y_{i;s} = F_s(X_i) + \varepsilon_{i;s}, \qquad i = 1, \dots, n.$$

Throughout this paper, we shall also assume that
$\sup_{x \in \mathbb{S}^2}|F_s(x)| = M < \infty$. As discussed in the Introduction, we envisage
a situation where it is possible to collect data which can be viewed as measurements on a spin
fiber bundle, for instance the polarization of the Cosmic Microwave Background (see [35],
[61], [10], [20], [IS]), or the Weak Gravitational Lensing effect on the images of distant
galaxies (see [7]). To fix ideas, we focus on this second example. As discussed for instance in
[15], the gravitational shear effect may be loosely described as gravity transforming the image
of galaxies into a more elliptical shape. Of course the measurement of this shear is subject to
an experimental error, for instance because of the unknown intrinsic ellipticity of the
observed galaxy. Likewise, the weak gravitational lensing may produce an alignment in the
inclination of nearby observations, but again this could be brought in by random fluctuations.
We refer to [7, ?] for a much more detailed discussion of motivations and related challenges,
which currently involve a large number of physicists; major satellite experiments are at the
planning stage, such as Euclid, see for instance http://hetdex.org/other_projects/euclid.php.
To model the framework discussed above, we introduce random directions of observations
$\{X_i \in \mathbb{S}^2\}$, which we take to be uniformly sampled over the sky, and
observational errors $\{\varepsilon_{i;s}\}$, $i = 1, 2, \dots, n$; the latter are independent
and identically distributed spin $s$ random variables, which we assume to be invariant in law
with respect to rotations in the tangent plane:

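$$\varepsilon_{i;s} \stackrel{d}{=} \varepsilon_{i;s}\,e^{is\psi}, \quad \text{for all } \psi \in [0, 2\pi],\ i = 1, 2, \dots, n, \qquad (8)$$

$\stackrel{d}{=}$ denoting equality in law. As in [6], (8) implies that

$$\mathrm{Re}\,\varepsilon_{i;s} \stackrel{d}{=} \mathrm{Im}\,\varepsilon_{i;s} \stackrel{d}{=} \varepsilon_i. \qquad (9)$$

From (8), (9) we have immediately

$$E[\varepsilon_{i;s}] = E[\mathrm{Re}\,\varepsilon_{i;s} + i\,\mathrm{Im}\,\varepsilon_{i;s}] = 0,$$

$$\mathrm{Var}(\varepsilon_{i;s}) = E|\varepsilon_{i;s}|^2 = 2E\varepsilon_i^2 =: \sigma^2_\varepsilon.$$

Moreover, we shall assume that $\{\varepsilon_i\}$ follows a sub-Gaussian distribution (ref.
[9]), i.e. there exists a number $a > 0$ such that for all $\lambda \in \mathbb{R}$ the
following inequality holds:

$$E\left[e^{\lambda\varepsilon_i}\right] \le e^{\frac{a^2\lambda^2}{2}}. \qquad (10)$$

We also define the sub-Gaussian standard of the random variable $\varepsilon_i$ as:

$$\tau(\varepsilon_i) = \inf\left\{a \ge 0 : E\left[e^{\lambda\varepsilon_i}\right] \le e^{\frac{a^2\lambda^2}{2}},\ \lambda \in \mathbb{R}\right\} < \infty.$$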
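It is immediate to check (see [H]) that:

$$\tau(\varepsilon_i) = \sup_{\lambda \ne 0}\left[\frac{2\log\left(E\left[e^{\lambda\varepsilon_i}\right]\right)}{\lambda^2}\right]^{1/2}, \qquad E\left[e^{\lambda\varepsilon_i}\right] \le e^{\frac{\lambda^2\tau^2(\varepsilon_i)}{2}}.$$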

As is well known, a random variable is sub-Gaussian if and only if its moment generating
function is majorized by the moment generating function of a zero-mean Gaussian random 
variable, whence the name sub-Gaussian. Indeed, the class of sub-Gaussian random variables 
contains, apart from the Gaussian themselves, all bounded zero-mean random variables and, 
more generally, all those random variables whose distribution tails decrease no slower than 
the tails of the Gaussian. We recall the following, simple results, whose proofs are available 
in 0: 

Lemma 2 Moment characterization for sub-Gaussian random variables.

Let $\varepsilon$ be a sub-Gaussian random variable such that $E(\varepsilon) = 0$. We have
that $E(\varepsilon^2) \le \tau^2(\varepsilon)$ and, for all $p > 0$,
$E(|\varepsilon|^p) < \infty$.



In view of Lemma 2, sub-Gaussian random variables enjoy the same moment inequalities
and concentration properties as Gaussian or bounded ones, and hence allow the implementation
of the main technical tools in the proofs of our asymptotic results to follow. In this
sense, they seem to provide a natural general framework for the analysis we must pursue.
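As a concrete instance of the sub-Gaussian condition, consider a symmetric random sign (a Rademacher variable), whose moment generating function is $\cosh\lambda$. The Python sketch below is an illustration added here, not part of the paper's argument: it checks the bound with $a = 1$ on a finite grid and approximates the sub-Gaussian standard by a grid maximum, which is only a lower approximation of the supremum.

```python
import math

def rademacher_mgf(lam):
    # E[exp(lam * eps)] for eps = +1 or -1 with probability 1/2 each.
    return math.cosh(lam)

def subgaussian_standard(mgf, lams):
    # Grid approximation of sup over lam != 0 of sqrt(2 log(mgf(lam)) / lam^2).
    return max(
        math.sqrt(2.0 * math.log(mgf(lam)) / (lam * lam))
        for lam in lams
        if lam != 0.0
    )

lams = [k / 100.0 for k in range(-500, 501)]

# Check E[exp(lam * eps)] <= exp(lam^2 / 2) on the grid, i.e. a = 1 works.
bound_holds = all(rademacher_mgf(lam) <= math.exp(lam * lam / 2.0) for lam in lams)

# The supremum is approached as lam tends to 0, so the grid value is just below 1.
tau = subgaussian_standard(rademacher_mgf, lams)
```

This is consistent with the remark that bounded zero-mean variables belong to the sub-Gaussian class.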



5.2 The estimation procedure 

The procedure we are going to investigate can be viewed as a form of needlet thresholding
in the spin fiber bundles case (we refer to [5] for a similar approach, in the case of density
estimation for standard scalar directional data). As discussed in the Introduction, we have now
two alternative forms of needlet construction for the spin case, i.e. the pure spin needlets
of [21] and the mixed spin needlets of [22]. Our approach could be implemented for both
techniques, and indeed the proofs would be nearly identical. For definiteness, we shall focus
on the mixed needlets construction, which yields coefficients which are standard,
complex-valued variables. For brevity's sake, however, we drop the subscript $M$. We start by
defining, as usual, an unbiased estimator for needlet coefficients. More precisely, we define

$$\hat\beta_{jk;s} = \frac{1}{n}\sum_{i=1}^{n}\overline{\psi_{jk;s}(X_i)}\,Y_{i;s}.$$
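We have immediately:

$$E(\hat\beta_{jk;s}) = \frac{1}{n}\sum_{i=1}^{n}E\left[\overline{\psi_{jk;s}(X_i)}F_s(X_i) + \overline{\psi_{jk;s}(X_i)}\varepsilon_{i;s}\right] = \int_{\mathbb{S}^2}\overline{\psi_{jk;s}(x)}F_s(x)\,dx = \beta_{jk;s}. \qquad (11)$$

Moreover

$$\mathrm{Var}(\hat\beta_{jk;s}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}\overline{\psi_{jk;s}(X_i)}F_s(X_i) + \frac{1}{n}\sum_{i=1}^{n}\overline{\psi_{jk;s}(X_i)}\varepsilon_{i;s}\right) \qquad (12)$$

$$= \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}\left(\overline{\psi_{jk;s}(X_i)}F_s(X_i)\right) + \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}\left(\overline{\psi_{jk;s}(X_i)}\varepsilon_{i;s}\right).$$

Now

$$\frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}\left(\overline{\psi_{jk;s}(X_i)}\varepsilon_{i;s}\right) = \frac{1}{n}\sigma^2_\varepsilon\|\psi_{jk;s}\|^2_{L^2_s(\mathbb{S}^2)} =: \frac{1}{n}\sigma^2_{\varepsilon,jk},$$

where in the last equality we used the independence of the $\varepsilon_{i;s}$. Note that
obviously $\sigma^2_{\varepsilon,jk} \le \sigma^2_\varepsilon$. Also

$$0 \le \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}\left(\overline{\psi_{jk;s}(X_i)}F_s(X_i)\right) \le \frac{1}{n}\int_{\mathbb{S}^2}\left|\overline{\psi_{jk;s}(x)}F_s(x)\right|^2dx \le \frac{M^2}{n},$$

and we define $\sigma^2_{jk} := \sigma^2_{\varepsilon,jk} + M^2$. We then proceed with the (now
classical) hard thresholding procedure (see for instance [16], [33] and [12]). In particular,
we fix the threshold as

$$\kappa t_n = \kappa\sqrt{\frac{\log n}{n}}, \qquad (13)$$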
where $\kappa$ is a real positive constant, whose value will be discussed later. Hence we define, as
usual,

$$\beta^*_{jk;s} = w_{jk}\,\hat\beta_{jk;s}, \qquad w_{jk} = \mathbb{I}_{\{|\hat\beta_{jk;s}| \ge \kappa t_n\}}, \tag{14}$$

where $\mathbb{I}_A$ denotes as usual the indicator function of the set A. The thresholding estimator is
hence

$$F^*_s(x) = \sum_{j=1}^{J_n}\sum_{k=1}^{N_j}\beta^*_{jk;s}\,\psi_{jk;s}(x). \tag{15}$$

In (15), $J_n$ represents a cut-off frequency, which we shall fix at $B^{J_n} = \sqrt{n/\log n}$, whereas $N_j$ is
the cardinality of the cubature point set at frequency j; it is known (see for instance [1]) that
there exist positive constants $C_1, C_2$ such that $C_1B^{2j} \le N_j \le C_2B^{2j}$ (written $N_j \approx B^{2j}$). Our
main result is to show that thresholding estimates achieve 'nearly optimal' (up to logarithmic
factors) rates with respect to general $L^p_s(\mathbb{S}^2)$ loss functions.
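
As an illustration of the recipe in (13), (14) and (15), the following minimal Python sketch applies the hard-thresholding rule to a list of empirical coefficients that are assumed to have been computed already. The function names (`threshold`, `hard_threshold`, `cutoff_level`) and the numerical values of `n`, `kappa` and `B` are hypothetical choices made for the example only; the needlet evaluation itself is not reproduced here.

```python
import math

def threshold(n: int, kappa: float) -> float:
    """Return kappa * t_n with t_n = sqrt(log n / n), as in (13)."""
    return kappa * math.sqrt(math.log(n) / n)

def hard_threshold(beta_hat: list[complex], n: int, kappa: float) -> list[complex]:
    """Keep a coefficient only if its modulus reaches the threshold, as in (14)."""
    cut = threshold(n, kappa)
    return [b if abs(b) >= cut else 0j for b in beta_hat]

def cutoff_level(n: int, B: float) -> int:
    """Largest integer j such that B**j <= sqrt(n / log n)."""
    return math.floor(0.5 * math.log(n / math.log(n)) / math.log(B))

# Illustrative values only (not taken from the paper).
n, kappa, B = 10_000, 2.0, 2.0
coeffs = [0.5 + 0.1j, 0.01 - 0.02j, -0.08j, 0.001 + 0j]
kept = hard_threshold(coeffs, n, kappa)
J_n = cutoff_level(n, B)
```

Coefficients whose modulus falls below the threshold are set to zero, so only the surviving ones contribute to the sum in (15).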

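Theorem 3 Let $F_s \in B^r_{\pi q;s}(G)$, the Besov ball such that $\|F_s\|_{B^r_{\pi q;s}} \le G < \infty$, $r - \frac{2}{\pi} > 0$, and
consider $F^*_s$ defined by (13), (14) and (15). For $1 \le p < \infty$, there exists $\kappa > 0$ such that we have

$$\sup_{F_s\in B^r_{\pi q;s}(G)}\mathbb{E}\left\|F^*_s - F_s\right\|^p_{L^p_s(\mathbb{S}^2)} \le C_p\,(\log n)^{p}\left\{\frac{n}{\log n}\right\}^{-\alpha(r,\pi,p)},$$

where

$$\alpha(r,\pi,p) = \begin{cases} \dfrac{rp}{2r+2} & \text{for } \pi \ge \dfrac{2p}{2r+2} \text{ (regular zone)}, \\[2ex] \dfrac{p\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)} & \text{for } \pi < \dfrac{2p}{2r+2} \text{ (sparse zone)}. \end{cases}$$

Also, for $p = \infty$,

$$\sup_{F_s\in B^r_{\pi q;s}(G)}\mathbb{E}\left\|F^*_s - F_s\right\|_{L^\infty_s(\mathbb{S}^2)} \le C_\infty\,\log n\left\{\frac{n}{\log n}\right\}^{-\alpha(r,\pi,\infty)}, \qquad \alpha(r,\pi,\infty) = \frac{r-\frac{2}{\pi}}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)}.$$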



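Remark 4 The definitions of regular and sparse zones are classical, and so are the rates we
obtained, which indeed correspond (for instance) to those presented by [5]. For brevity's sake,
we do not prove that these rates are indeed minimax (up to logarithmic terms), but it seems
easy to achieve this goal by application of classical arguments, as for instance presented by
[33]. It is trivial to note that for $\pi = \frac{2p}{2r+2}$, that is $\frac{1}{\pi} = \frac{r+1}{p}$, we have

$$\frac{r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)} = \frac{r-2\left(\frac{r+1}{p}-\frac{1}{p}\right)}{2\left(r-2\left(\frac{r+1}{p}-\frac{1}{2}\right)\right)} = \frac{r(p-2)}{2r(p-2)+2(p-2)} = \frac{r}{2r+2},$$

and also

$$\frac{rp}{2r+2} \le \frac{p\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)} \text{ in the regular zone}, \qquad \frac{rp}{2r+2} \ge \frac{p\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)} \text{ in the sparse zone},$$

so that the two expressions for $\alpha(r,\pi,p)$ agree on the boundary between the two zones.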



Remark 5 For s = 0, our results cover adaptive nonparametric regression for complex-valued,
scalar functions. Again, the rates correspond to the usual nearly minimax bounds.
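
The rate exponent of Theorem 3 is easy to tabulate. The short Python sketch below evaluates it for given smoothness and integrability indices; the function names (`regular_rate`, `sparse_rate`, `alpha`, `alpha_sup`) and the argument name `pi_` for the Besov index are assumptions introduced for this example, and the formulas are the ones read off the statement of the theorem.

```python
def regular_rate(r: float, p: float) -> float:
    """Exponent r p / (2 r + 2) of the regular zone."""
    return r * p / (2.0 * r + 2.0)

def sparse_rate(r: float, pi_: float, p: float) -> float:
    """Exponent p (r - 2(1/pi - 1/p)) / (2 (r - 2(1/pi - 1/2))) of the sparse zone."""
    return p * (r - 2.0 * (1.0 / pi_ - 1.0 / p)) / (2.0 * (r - 2.0 * (1.0 / pi_ - 0.5)))

def alpha(r: float, pi_: float, p: float) -> float:
    """Rate exponent for finite p; requires r - 2/pi > 0."""
    if r - 2.0 / pi_ <= 0:
        raise ValueError("the theorem assumes r - 2/pi > 0")
    if pi_ >= 2.0 * p / (2.0 * r + 2.0):      # regular zone
        return regular_rate(r, p)
    return sparse_rate(r, pi_, p)             # sparse zone

def alpha_sup(r: float, pi_: float) -> float:
    """Rate exponent for the sup-norm loss (p = infinity)."""
    return (r - 2.0 / pi_) / (2.0 * (r - 2.0 * (1.0 / pi_ - 0.5)))

# On the boundary pi = 2p / (2r + 2) the two zone formulas coincide.
r, p = 2.0, 6.0
pi_boundary = 2.0 * p / (2.0 * r + 2.0)
gap = abs(regular_rate(r, p) - sparse_rate(r, pi_boundary, p))
```

Away from the boundary, `alpha` returns the smaller of the two candidate exponents, which is the slower of the two rates.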



The proof of Theorem 3 is provided in the next section.
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6 Proofs 

Our arguments will follow classical approaches in this presented for instance by 

6.1 An auxiliary result 

We shall need in the sequel some sharp bounds which are provided in the following result. 
The arguments are close, for instance, to those for the inequality (65) on page 1088 of [37] 
where the case of a scalar Gaussian noise is considered: see also Proposition 15 in [5]. 

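Proposition 6 Let $\{\varepsilon_{i;s}\}$ satisfy the noise assumptions stated earlier. Assume also that $M :=
\|F_s\|_\infty < \infty$. For all $\gamma > 0$ and for all j such that $B^j \le \sqrt{n/\log n}$, there exists $\kappa_\gamma > 0$ such
that for $\kappa > \kappa_\gamma$ the following inequality holds:

$$P\left(\left|\hat\beta_{jk;s}-\beta_{jk;s}\right| > \kappa t_n\right) \le C\,n^{-\gamma}, \tag{16}$$

where $\gamma \sim \kappa^{4/3}$. Moreover, for all $p > 0$ we have

$$\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C_p\,n^{-p/2}, \tag{17}$$

$$\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\right] \le C_\infty\,(j+1)^{p-1}\,n^{-p/2}. \tag{18}$$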


Remark 7 It is possible to obtain sharp analytic expressions for $\kappa$, $C_p$, $C_\infty$, for instance by
arguing as in Lemma 16 of [5].



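Proof Note first that

$$\hat\beta_{jk;s} = \frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\left\{F_s(X_i)+\varepsilon_{i;s}\right\} \tag{19}$$

and

$$\hat\beta_{jk;s}-\beta_{jk;s} = \frac{1}{n}\sum_{i=1}^{n}\left\{\overline{\psi}_{jk;s}(X_i)F_s(X_i) - \mathbb{E}\left(\overline{\psi}_{jk;s}(X_i)F_s(X_i)\right)\right\} + \frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\,\varepsilon_{i;s}$$

$$= \frac{1}{n}\sum_{i=1}^{n}\Psi_{jk;s}(X_i) + \frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\,\varepsilon_{i;s}, \tag{20}$$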
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where 



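$$\Psi_{jk;s}(X_i) := \overline{\psi}_{jk;s}(X_i)F_s(X_i) - \mathbb{E}\left(\overline{\psi}_{jk;s}(X_i)F_s(X_i)\right).$$

Consider $P_\beta(x) := P\left(|\hat\beta_{jk;s}-\beta_{jk;s}| > x\right)$:

$$P_\beta(x) \le P_F(x) + P_\varepsilon(x), \tag{21}$$

where:

$$P_F(x) := P\left(\left|\frac{1}{n}\sum_{i=1}^{n}\Psi_{jk;s}(X_i)\right| > \frac{x}{2}\right), \qquad P_\varepsilon(x) := P\left(\left|\frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\,\varepsilon_{i;s}\right| > \frac{x}{2}\right).$$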




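As before, we can split these sums into real and imaginary parts and apply the following
procedure separately to each part, both in $P_F(x)$ and in $P_\varepsilon(x)$; the two parts give the
same results.

As far as $P_F(x)$ is concerned, we use the fact that the $\Psi_{jk;s}(X_i)$ are i.i.d. random variables such
that for each of them:

$$\sup\left|\Psi_{jk;s}(X_i)\right| \le 2cMB^j,$$

$$\mathbb{E}\left(\left|\Psi_{jk;s}(X_i)\right|^2\right) \le \mathbb{E}\left(\left|\overline{\psi}_{jk;s}(X_i)F_s(X_i)\right|^2\right) \le M^2\left\|\psi_{jk;s}\right\|^2_{L^2(\mathbb{S}^2)} \le M^2.$$

We therefore apply the Bernstein inequality: for a sequence of i.i.d. random variables $\{X_i\}_{i=1}^n$
such that $\mathbb{E}[X_i] = 0$, $|X_i| \le M$ and $\mathbb{E}[X_i^2] = \sigma^2$, we have

$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n}X_i\right| > x\right) \le 2\exp\left(-\frac{nx^2}{2\left(\sigma^2+\frac{Mx}{3}\right)}\right), \tag{22}$$

see for instance [33] for a proof.

By applying Bernstein with $x/2$ in place of $x$, variance bound $M^2$ and uniform bound $2cMB^j$, we obtain:

$$P_F(x) \le 4\exp\left(-\frac{n\,(x/2)^2}{\frac{2}{3}\left(3M^2+cMB^jx\right)}\right), \tag{23}$$

where the factor 4 accounts for both real and imaginary parts.
Fixing $x = \kappa t_n$, the following result is obtained:

$$P_F(\kappa t_n) \le 4\exp\left(-\frac{n\left((\kappa/2)\sqrt{\log n/n}\right)^2}{\frac{2}{3}\left(3M^2+cMB^j\kappa\sqrt{\log n/n}\right)}\right)$$

and by choosing j such that $B^j \le \sqrt{n/\log n}$:

$$P_F(\kappa t_n) \le 4\exp\left(-\frac{3\kappa^2\log n}{8M(3M+c\kappa)}\right) = 4\,n^{-\frac{3\kappa^2}{8M(3M+c\kappa)}}. \tag{24}$$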
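
As a quick sanity check of the Bernstein bound (22), the Python sketch below compares it with the exact tail of an average of fair random signs, for which the variance and the uniform bound are both equal to 1. The helper names (`bernstein_bound`, `rademacher_tail`, `rate_exponent`) and the numerical values are hypothetical and serve only this illustration; `rate_exponent` simply evaluates the power of n appearing in (24).

```python
import math

def bernstein_bound(n: int, x: float, sigma2: float, M: float) -> float:
    """Right-hand side of (22): 2 exp(-n x^2 / (2 (sigma^2 + M x / 3)))."""
    return 2.0 * math.exp(-n * x * x / (2.0 * (sigma2 + M * x / 3.0)))

def rademacher_tail(n: int, x: float) -> float:
    """Exact P(|mean of n fair +-1 signs| > x), computed from binomial counts."""
    favourable = sum(math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) > n * x)
    return favourable / 2 ** n

def rate_exponent(kappa: float, M: float, c: float) -> float:
    """Power of 1/n in (24): 3 kappa^2 / (8 M (3 M + c kappa))."""
    return 3.0 * kappa ** 2 / (8.0 * M * (3.0 * M + c * kappa))

exact = rademacher_tail(100, 0.3)
bound = bernstein_bound(100, 0.3, sigma2=1.0, M=1.0)
```

The exact tail stays below the bound, as expected, and `rate_exponent` grows with `kappa`, which is why a large enough threshold constant yields any prescribed polynomial decay.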



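As far as $P_\varepsilon(x)$ is concerned, consider that conditionally on $(X_1,\dots,X_n)$, $\frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\,\varepsilon_{i;s}$
is a complex-valued subGaussian variable with mean 0 and variance $\frac{1}{n^2}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2\sigma_\varepsilon^2$.
Therefore, by using Markov's inequality, we obtain:

$$P_\varepsilon(x) \le \mathbb{E}\left[\exp\left(-\frac{nx^2}{8\sigma_\varepsilon^2\,\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2}\right)\right].$$

Observe that the $|\psi_{jk;s}(X_i)|^2$ are i.i.d. variables bounded by $cB^{2j}$, such that

$$\mathbb{E}\left(|\psi_{jk;s}(X_i)|^2\right) = \int_{\mathbb{S}^2}|\psi_{jk;s}(x)|^2\,dx = \|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)} \le 1.$$

Therefore we split the denominator into 2 terms, using the indicators of the events

$$\left\{\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| \le a\right\} \quad\text{and}\quad \left\{\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| > a\right\}, \quad a > 0,$$

so that

$$P_\varepsilon(x) \le \exp\left(-\frac{nx^2}{8\sigma_\varepsilon^2(1+a)}\right) + P\left(\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| > a\right).$$

Now, by fixing $x = \kappa t_n$, we obtain the following result:

$$P_\varepsilon(\kappa t_n) \le \exp\left(-\frac{\kappa^2\log n}{8\sigma_\varepsilon^2(a+1)}\right) + P\left(\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| > a\right).$$

Now, we use on the second term Hoeffding's inequality:

$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| > a\right) \le 2\exp\left(-\frac{2n^2a^2}{ncB^{2j}}\right).$$

Again, because $B^{2j} \le \frac{n}{\log n}$, we obtain:

$$P_\varepsilon(\kappa t_n) \le 2\exp\left(-\frac{2a^2\log n}{c}\right) + \exp\left(-\frac{\kappa^2\log n}{8\sigma_\varepsilon^2(a+1)}\right) = 2\,n^{-\frac{2a^2}{c}} + n^{-\frac{\kappa^2}{8\sigma_\varepsilon^2(a+1)}}. \tag{25}$$

We fix $a \sim \kappa^{2/3}$ in order to obtain the same order of magnitude between the two terms, and
by using (24) and (25) finally we obtain:

$$P_\varepsilon,\ P_F \le C\,n^{-c\kappa^{4/3}}.$$

In order to prove (17) we use again (19) to obtain: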



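$$\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le 2^{p-1}\left\{\mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\left\{\overline{\psi}_{jk;s}(X_i)F_s(X_i)-\mathbb{E}\left(\overline{\psi}_{jk;s}(X_i)F_s(X_i)\right)\right\}\right|^p + \mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\overline{\psi}_{jk;s}(X_i)\,\varepsilon_{i;s}\right|^p\right\} = 2^{p-1}\left(E_F+E_\varepsilon\right).$$

We need to split again both $E_F$ and $E_\varepsilon$ into real and imaginary parts. Note that

$$E_F = \mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{Re}\,\Psi_{jk;s}(X_i)+i\,\mathrm{Im}\,\Psi_{jk;s}(X_i)\right)\right|^p \le 2^{p-1}\left\{\mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\mathrm{Re}\,\Psi_{jk;s}(X_i)\right|^p + \mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\mathrm{Im}\,\Psi_{jk;s}(X_i)\right|^p\right\} = 2^{p-1}\left(E^1_F+E^2_F\right)$$

and

$$E_\varepsilon = \mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{Re}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}+i\,\mathrm{Im}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}\right)\right|^p \le 2^{p-1}\left\{\mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\mathrm{Re}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}\right|^p + \mathbb{E}\left|\frac{1}{n}\sum_{i=1}^{n}\mathrm{Im}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}\right|^p\right\} = 2^{p-1}\left(E^1_\varepsilon+E^2_\varepsilon\right). \tag{26}$$

For $0 < p \le 2$, we apply the classical convexity inequality, which states that for $0 < p \le 2$ and
for independent random variables $Z_i$ such that $\mathbb{E}[Z_i] = 0$ and $\mathbb{E}(|Z_i|^p) < \infty$:

$$\mathbb{E}\left|\sum_{i=1}^{n}Z_i\right|^p \le \left(\mathbb{E}\left|\sum_{i=1}^{n}Z_i\right|^2\right)^{p/2} = \left(\sum_{i=1}^{n}\mathbb{E}|Z_i|^2\right)^{p/2}.$$

As noted for instance in [53], in the case $2 < p < \infty$, we obtain a very similar result by
applying the Rosenthal inequality, i.e.:

Let $\{Z_i\}_{i=1}^n$ be independent random variables such that $\mathbb{E}(Z_i) = 0$ and, for $p > 2$,
$\mathbb{E}(|Z_i|^p) < \infty$. Then there exists $C_p$ such that:

$$\mathbb{E}\left|\sum_{i=1}^{n}Z_i\right|^p \le C_p\left\{\sum_{i=1}^{n}\mathbb{E}|Z_i|^p + \left(\sum_{i=1}^{n}\mathbb{E}|Z_i|^2\right)^{p/2}\right\}. \tag{27}$$

A proof of (27) can be found for instance in [33]. We apply (27) to each term in
(26) to obtain:

$$E^1_F \le C_p\left\{\frac{\mathbb{E}\left(|\mathrm{Re}\,\Psi_{jk;s}(X_i)|^p\right)}{n^{p-1}} + \frac{\left(\mathbb{E}\left(|\mathrm{Re}\,\Psi_{jk;s}(X_i)|^2\right)\right)^{p/2}}{n^{p/2}}\right\}, \tag{28}$$

$$E^2_F \le C_p\left\{\frac{\mathbb{E}\left(|\mathrm{Im}\,\Psi_{jk;s}(X_i)|^p\right)}{n^{p-1}} + \frac{\left(\mathbb{E}\left(|\mathrm{Im}\,\Psi_{jk;s}(X_i)|^2\right)\right)^{p/2}}{n^{p/2}}\right\}, \tag{29}$$

$$E^1_\varepsilon \le C_p\left\{\frac{\mathbb{E}\left(|\mathrm{Re}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^p\right)}{n^{p-1}} + \frac{\left(\mathbb{E}\left(|\mathrm{Re}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^2\right)\right)^{p/2}}{n^{p/2}}\right\}, \tag{30}$$

$$E^2_\varepsilon \le C_p\left\{\frac{\mathbb{E}\left(|\mathrm{Im}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^p\right)}{n^{p-1}} + \frac{\left(\mathbb{E}\left(|\mathrm{Im}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^2\right)\right)^{p/2}}{n^{p/2}}\right\}. \tag{31}$$

Recalling that $B^j \le \sqrt{n/\log n} \le \sqrt{n}$, we obtain:

$$\mathbb{E}\left(|\mathrm{Re}\,\Psi_{jk;s}(X_i)|^p\right),\ \mathbb{E}\left(|\mathrm{Im}\,\Psi_{jk;s}(X_i)|^p\right) \le c\,\mathbb{E}\left(|\overline{\psi}_{jk;s}(X_i)F_s(X_i)|^p\right) = c\int_{\mathbb{S}^2}|\psi_{jk;s}(x)F_s(x)|^p\,dx \le cM^pB^{j(p-2)} \le cM^pn^{\frac{p-2}{2}}.$$

As far as the noise-related terms are concerned, we obtain:

$$\mathbb{E}\left(|\mathrm{Re}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^p\right),\ \mathbb{E}\left(|\mathrm{Im}\{\overline{\psi}_{jk;s}(X_i)\varepsilon_{i;s}\}|^p\right) \le \mathbb{E}\left(|\varepsilon_{i;s}|^p\right)cB^{j(p-2)} \le c\,n^{\frac{p-2}{2}}.$$

Then, by substituting the last inequalities in (28), (29), (30) and (31) we obtain:

$$\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C_p\,n^{-p/2}.$$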



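Now we study the case $p = \infty$: in order to prove (18), we majorize:

$$\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\right] \le \int_{\mathbb{R}_+}p\,x^{p-1}\,P\left(\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right| > x\right)dx. \tag{32}$$

Recalling the procedure used in the proof of (16), for $B^j \le \sqrt{n}$, (23) becomes:

$$P_F(x) \le 4\left\{\exp\left(-\frac{nx^2}{16M^2}\right) + \exp\left(-\frac{3\sqrt{n}\,x}{16cM}\right)\right\}, \tag{33}$$

while, in a similar way, we split the first term on (25) as:

$$\exp\left(-\frac{nx^2}{8\sigma_\varepsilon^2(1+a)}\right) \le \exp\left(-\frac{nx^2}{16\sigma_\varepsilon^2}\right) + \exp\left(-\frac{nx^2}{16\sigma_\varepsilon^2a}\right) =: P^1_\varepsilon(x) + P^2_{\varepsilon,a}(x). \tag{34}$$

By applying on the last term of (25) the Hoeffding inequality and for $B^j \le \sqrt{n}$, we obtain:

$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n}|\psi_{jk;s}(X_i)|^2-\|\psi_{jk;s}\|^2_{L^2(\mathbb{S}^2)}\right| > a\right) \le \exp\left(-\frac{2n^2a^2}{ncB^{2j}}\right) \le \exp\left(-\frac{2a^2}{c}\right) =: P^3_{\varepsilon,a}.$$

We choose $a = \frac{c^{1/3}n^{1/3}x^{2/3}}{32^{1/3}\sigma_\varepsilon^{2/3}}$ to obtain

$$P^2_{\varepsilon,a}+P^3_{\varepsilon,a} \le C\exp\left(-\frac{n^{2/3}x^{4/3}}{2^{7/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right), \tag{35}$$

and in view of (21), (33), (34) and (35),

$$P_\beta(x) \le C\left\{\exp\left(-\frac{nx^2}{16\sigma_\varepsilon^2}\right) + \exp\left(-\frac{nx^2}{16M^2}\right) + \exp\left(-\frac{3\sqrt{n}\,x}{16cM}\right) + \exp\left(-\frac{n^{2/3}x^{4/3}}{2^{7/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right)\right\}.$$

Now we fix a parameter $a = \max\left(4\sqrt{2}\,\sigma_\varepsilon,\ 4\sqrt{2}\,M,\ \tfrac{32}{3}cM,\ 2^{11/4}\sigma_\varepsilon c^{1/4}\right)$. Write (32) as:

$$\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\right] \le \int_{0<x\le\frac{aj}{\sqrt{n}}}p\,x^{p-1}\,dx + 2c\int_{x>\frac{aj}{\sqrt{n}}}C\,x^{p-1}B^{j}\left\{\exp\left(-\frac{nx^2}{16\sigma_\varepsilon^2}\right) + \exp\left(-\frac{nx^2}{16M^2}\right) + \exp\left(-\frac{3\sqrt{n}\,x}{16cM}\right) + \exp\left(-\frac{n^{2/3}x^{4/3}}{2^{7/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right)\right\}dx \tag{36}$$

$$=: E^1_\infty + E^2_\infty + E^3_\infty,$$

where $E^1_\infty$ is the integral over $0 < x \le aj/\sqrt{n}$, $E^2_\infty$ collects the term in $\exp(-n^{2/3}x^{4/3}/\cdot)$ and $E^3_\infty$ the remaining exponential terms.

We observe that for each term depending on $\exp(-nx^2/C)$, where $C = 16\sigma_\varepsilon^2,\ 16M^2$, and
for $x \ge aj/\sqrt{n}$, we have:

$$B^{j}\exp\left(-\frac{nx^2}{C}\right) \le \exp\left(-\frac{nx^2}{2C}\right).$$

Similarly, we have for $x \ge aj/\sqrt{n}$:

$$B^{j}\exp\left(-\frac{3\sqrt{n}\,x}{16cM}\right) \le \exp\left(-\frac{3\sqrt{n}\,x}{32cM}\right),$$

and finally, again for $x \ge aj/\sqrt{n}$,

$$B^{j}\exp\left(-\frac{n^{2/3}x^{4/3}}{2^{7/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right) \le \exp\left(-\frac{n^{2/3}x^{4/3}}{2^{10/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right).$$

Likewise, the integral $E^1_\infty$ is simply majorized by:

$$E^1_\infty \le C\left(\frac{aj}{\sqrt{n}}\right)^p \le C_p\,j^p\,n^{-p/2}. \tag{37}$$

As far as $E^2_\infty$ is concerned, by using the change of variable $u = \sqrt{n}\,x$ we obtain:

$$E^2_\infty \le 2C\left(\frac{1}{n}\right)^{p/2}\int_{u>aj}u^{p-1}\exp\left(-\frac{u^{4/3}}{2^{10/3}\sigma_\varepsilon^{4/3}c^{1/3}}\right)du \le C_p\,n^{-p/2}. \tag{38}$$

A similar procedure is applied to $E^3_\infty$ by using the same change of variable $u = \sqrt{n}\,x$ to obtain:

$$E^3_\infty \le C'_p\,n^{-p/2}. \tag{39}$$

Finally, by substituting (37), (38) and (39) in (36) we obtain the thesis.

□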
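
The choice of the auxiliary parameter a in the sup-norm part of the proof balances two competing exponents. The Python sketch below checks this balance numerically, under the assumption (read off the displays around (34) and (35)) that the two exponents are $nx^2/(16\sigma_\varepsilon^2 a)$ and $2a^2/c$; the function names and the numerical values are hypothetical.

```python
import math

def balanced_a(n: float, x: float, sigma_eps: float, c: float) -> float:
    """Solve n x^2 / (16 sigma^2 a) = 2 a^2 / c for a."""
    return (c * n * x * x / (32.0 * sigma_eps ** 2)) ** (1.0 / 3.0)

def exponents(n: float, x: float, sigma_eps: float, c: float, a: float):
    """Return the Gaussian-type and the Hoeffding-type exponents for a given a."""
    gauss = n * x * x / (16.0 * sigma_eps ** 2 * a)
    hoeffding = 2.0 * a * a / c
    return gauss, hoeffding

def common_exponent(n: float, x: float, sigma_eps: float, c: float) -> float:
    """Closed form n^(2/3) x^(4/3) / (2^(7/3) sigma^(4/3) c^(1/3)) of the balanced exponent."""
    return n ** (2 / 3) * x ** (4 / 3) / (2 ** (7 / 3) * sigma_eps ** (4 / 3) * c ** (1 / 3))

# Illustrative values only.
n, x, sigma_eps, c = 5_000.0, 0.2, 1.5, 3.0
a = balanced_a(n, x, sigma_eps, c)
gauss, hoeffding = exponents(n, x, sigma_eps, c, a)
closed_form = common_exponent(n, x, sigma_eps, c)
```

With the balanced value of a both exponents coincide with the closed form, which is the quantity appearing inside the exponential in (35).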






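6.2 Proof of Theorem 3

As customary in this literature, the proof can be divided into different cases, as follows.
• Regular zone, p < ∞
We start as usual from

$$\mathbb{E}\left\|F^*_s-F_s\right\|^p_{L^p_s(\mathbb{S}^2)} = \mathbb{E}\left\|\sum_{j\le J_n}\sum_k w_{jk}\hat\beta_{jk;s}\psi_{jk;s} - \sum_j\sum_k\beta_{jk;s}\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)}$$

$$\le 2^{p-1}\left\{\mathbb{E}\left\|\sum_{j\le J_n}\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)} + \left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)}\right\} =: I + II.$$

For $p \le \pi$, we have $B^r_{\pi q;s} \subset B^r_{pq;s}$, whence we can always take $\pi = p$ in this case; hence we
focus on $p > \pi$. Here we have the embedding $B^r_{\pi q;s} \subset B^{r-2(\frac{1}{\pi}-\frac{1}{p})}_{pq;s}$, whence

$$\left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|_{L^p_s(\mathbb{S}^2)} = O\left(B^{-J_n\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)}\right) = O\left(\left\{\frac{n}{\log n}\right\}^{-\left(\frac{r}{2}-\frac{1}{\pi}+\frac{1}{p}\right)}\right),$$

and because in the regular zone $\pi \ge \frac{2p}{2r+2}$, that is $\frac{r}{2r+2} \le \frac{r\pi}{2p}$, we obtain

$$\left(\frac{r}{2}-\frac{1}{\pi}+\frac{1}{p}\right) - \frac{r}{2r+2} \ge \left(\frac{r}{2}-\frac{1}{\pi}+\frac{1}{p}\right) - \frac{r\pi}{2p} = \left(\frac{1}{\pi}-\frac{1}{p}\right)\left(\frac{r\pi}{2}-1\right) > 0.$$

Hence the bias term is fixed. For the variance term we have

$$I \le C\,J_n^{p-1}\sum_{j\le J_n}\mathbb{E}\left\|\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)}.$$

Now we split I into four zones; more precisely, we shall label A (respectively U) the regions where the
estimated coefficient is above (resp. under) the threshold $\kappa t_n$, and a (respectively u) the
regions where the deterministic coefficients are above or under a new threshold, which is $\kappa t_n/2$
in A and $2\kappa t_n$ in U. We hence obtain

$$\sum_{j\le J_n}\mathbb{E}\left\|\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)}$$

$$\le C\Bigg\{\sum_{j\le J_n}\mathbb{E}\left\|\sum_k\left(\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\,\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\right\|^p_{L^p_s(\mathbb{S}^2)} + \sum_{j\le J_n}\mathbb{E}\left\|\sum_k\left(\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\,\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|<\kappa t_n/2\}}\right\|^p_{L^p_s(\mathbb{S}^2)}$$

$$+ \sum_{j\le J_n}\mathbb{E}\left\|\sum_k\beta_{jk;s}\psi_{jk;s}\,\mathbb{I}_{\{|\hat\beta_{jk;s}|<\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|\ge 2\kappa t_n\}}\right\|^p_{L^p_s(\mathbb{S}^2)} + \sum_{j\le J_n}\mathbb{E}\left\|\sum_k\beta_{jk;s}\psi_{jk;s}\,\mathbb{I}_{\{|\hat\beta_{jk;s}|<\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}}\right\|^p_{L^p_s(\mathbb{S}^2)}\Bigg\}$$

$$\le C\Bigg\{\sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,\mathbb{E}\left[\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\right]\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} + \sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,\mathbb{E}\left[\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\right]\mathbb{I}_{\{|\beta_{jk;s}|<\kappa t_n/2\}}$$

$$+ \sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,|\beta_{jk;s}|^p\,\mathbb{E}\left[\mathbb{I}_{\{|\hat\beta_{jk;s}|<\kappa t_n\}}\right]\mathbb{I}_{\{|\beta_{jk;s}|\ge 2\kappa t_n\}} + \sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,|\beta_{jk;s}|^p\,\mathbb{E}\left[\mathbb{I}_{\{|\hat\beta_{jk;s}|<\kappa t_n\}}\right]\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}}\Bigg\}$$

$$=: Aa + Au + Ua + Uu.$$

This idea is the same as in [5], where the regions are labelled instead Bb, Bs, Sb, Ss; we
preferred to avoid B and b, which have a different use in the present work. Heuristically, the
cross-terms Au, Ua are easier to bound, as we can exploit the quick decay of $P\left\{|\hat\beta_{jk;s}-\beta_{jk;s}| > \frac{\kappa t_n}{2}\right\}$;
for Aa, Uu the crucial bounds will be derived by the tail behaviour in the Besov balls $B^r_{\pi q;s}(G)$.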
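
The four-region bookkeeping can be illustrated by a small Python sketch that assigns a pair (estimated coefficient, true coefficient) to one of the labels Aa, Au, Ua, Uu, following the verbal description given in the text. The function name `zone` and the sample values are hypothetical; in practice the true coefficient is of course unknown, so this classification is purely a proof device.

```python
import math

def zone(beta_hat: complex, beta: complex, kappa: float, t_n: float) -> str:
    """Label the region of a coefficient pair relative to the threshold kappa * t_n."""
    thr = kappa * t_n
    if abs(beta_hat) >= thr:                       # estimate Above the threshold
        return "Aa" if abs(beta) >= thr / 2 else "Au"
    return "Ua" if abs(beta) >= 2 * thr else "Uu"  # estimate Under the threshold

# Illustrative values only.
n, kappa = 10_000, 2.0
t_n = math.sqrt(math.log(n) / n)
pairs = [(0.30, 0.28), (0.07, 0.01), (0.05, 0.20), (0.01, 0.02)]
labels = [zone(bh, b, kappa, t_n) for bh, b in pairs]
```

In the two cross regions the estimation error is necessarily large: at least half the threshold in Au, and at least the full threshold in Ua. This is why these terms are controlled by the tail probability of the estimation error.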
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Note firstly that 



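$$Aa \le C\sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\,\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C\sum_{j\le J_n}\sum_kB^{j(p-2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\,\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p;$$

now from (17) and (5) we know that

$$\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C_p\,n^{-p/2}, \qquad \sum_{j\le J}\sum_kB^{j(p-2)} = O\left(B^{Jp}\right).$$

Write

$$\sum_{j\le J_n}\sum_kB^{j(p-2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\,\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C\,n^{-p/2}\left\{\sum_{j\le J_{1n}}\sum_kB^{j(p-2)} + \sum_{j>J_{1n}}\sum_kB^{j(p-2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\right\}.$$

Fix

$$B^{J_{1n}} = \left\{\frac{n}{\log n}\right\}^{\frac{1}{2(r+1)}}$$

and note that we have

$$\sum_{j>J_{1n}}\sum_kB^{j(p-2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} \le \sum_{j>J_{1n}}\sum_kB^{j(p-2)}\,|\beta_{jk;s}|^p\left(\frac{\kappa t_n}{2}\right)^{-p} \le C\left\{\frac{n}{\log n}\right\}^{p/2}\sum_{j>J_{1n}}\sum_k|\beta_{jk;s}|^p\,\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)},$$

where

$$\sum_k|\beta_{jk;s}|^p\,\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)} \le C\,B^{-prj},$$

because by assumption $F_s$ belongs to the Besov ball of smoothness r. Hence

$$\sum_{j>J_{1n}}\sum_kB^{j(p-2)}\,\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} \le C\left\{\frac{n}{\log n}\right\}^{p/2}B^{-prJ_{1n}} \le C\left\{\frac{n}{\log n}\right\}^{p/2}\left\{\frac{n}{\log n}\right\}^{-\frac{pr}{2(r+1)}} = C\left\{\frac{n}{\log n}\right\}^{\frac{p(r+1)-pr}{2(r+1)}} = C\left\{\frac{n}{\log n}\right\}^{\frac{p}{2(r+1)}}$$

and

$$C\sum_{j\le J_{1n}}\sum_kB^{j(p-2)}\,\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p \le C\,n^{-p/2}B^{pJ_{1n}} \le C\,n^{-p/2}\left\{\frac{n}{\log n}\right\}^{\frac{p}{2(r+1)}} \le C\left\{\frac{n}{\log n}\right\}^{-\frac{pr}{2(r+1)}}.$$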



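Hence the term is fixed. For the term Uu, it suffices to observe that

$$Uu \le C\sum_{j\le J_n}\sum_k\|\psi_{jk;s}\|^p_{L^p_s(\mathbb{S}^2)}\,|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} \le C\left\{\sum_{j\le J_{1n}}\sum_kB^{j(p-2)}\left|2\kappa t_n\right|^p + \sum_{j>J_{1n}}\sum_kB^{j(p-2)}\,|\beta_{jk;s}|^p\right\}$$

$$\le C\left\{B^{pJ_{1n}}\left\{\frac{n}{\log n}\right\}^{-p/2} + B^{-prJ_{1n}}\right\} \le C\left\{\left\{\frac{n}{\log n}\right\}^{\frac{p}{2(r+1)}}\left\{\frac{n}{\log n}\right\}^{-p/2} + \left\{\frac{n}{\log n}\right\}^{-\frac{pr}{2(r+1)}}\right\} \le C\left\{\frac{n}{\log n}\right\}^{-\frac{pr}{2(r+1)}}.$$

Now note that

$$Au \le C\sum_{j\le J_n}\sum_kB^{j(p-2)}\,\mathbb{E}\left[\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^p\,\mathbb{I}_{\{|\hat\beta_{jk;s}-\beta_{jk;s}|\ge\kappa t_n/2\}}\right] \le C\sum_{j\le J_n}\sum_kB^{j(p-2)}\left\{\mathbb{E}\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|^{2p}\right\}^{1/2}\left\{P\left[\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|\ge\kappa t_n/2\right]\right\}^{1/2}$$

and using (16) and (17)

$$Au \le C\,n^{-p/2}B^{pJ_n}n^{-\gamma/2} \le C\,n^{-p/2}\left\{\frac{n}{\log n}\right\}^{p/2}n^{-\gamma/2} = C\,(\log n)^{-p/2}\,n^{-\gamma/2}.$$

Finally, since $|\hat\beta_{jk;s}| < \kappa t_n$ and $|\beta_{jk;s}| \ge 2\kappa t_n$ imply $|\hat\beta_{jk;s}-\beta_{jk;s}| \ge \kappa t_n$,

$$Ua \le C\sum_{j\le J_n}\sum_kB^{j(p-2)}\,|\beta_{jk;s}|^p\,P\left[\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|\ge\kappa t_n\right] \le C\,n^{-\gamma}.$$

Because obviously $n^{-\gamma} \le n^{-\gamma/2}$, we have to choose $\gamma$ such that:

$$n^{-\gamma/2} \le n^{-\frac{pr}{2r+2}} \implies \gamma \ge \frac{pr}{r+1}.$$

We can hence take $\kappa \sim \gamma^{3/4}$, which yields

$$\kappa \ge C\left(\frac{pr}{r+1}\right)^{3/4}.$$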

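• The case p = ∞

Assume first that $F_s \in B^r_{\infty\infty;s}$. Then

$$\mathbb{E}\left\|F^*_s-F_s\right\|_{L^\infty_s(\mathbb{S}^2)} \le \mathbb{E}\left\|\sum_{j\le J_n}\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} + \left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} =: I + II.$$

For II, it is sufficient to note that

$$\left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} \le \sum_{j>J_n}\left\|\sum_k\beta_{jk;s}\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} = O\left(B^{-rJ_n}\right) = O\left(\left\{\frac{n}{\log n}\right\}^{-r/2}\right) = O\left(\left\{\frac{n}{\log n}\right\}^{-\frac{r}{2(r+1)}}\right).$$

On the other hand,

$$\mathbb{E}\left\|\sum_{j\le J_n}\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} \le \sum_{j\le J_n}\mathbb{E}\left\|\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|_{L^\infty_s(\mathbb{S}^2)} \le C\sum_{j\le J_n}B^j\,\mathbb{E}\sup_k\left|w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right|$$

$$\le C\sum_{j\le J_n}B^j\,\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}}\right] + C\sum_{j\le J_n}B^j\,\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|\mathbb{I}_{\{|\hat\beta_{jk;s}|\ge\kappa t_n\}}\mathbb{I}_{\{|\beta_{jk;s}|<\kappa t_n/2\}}\right]$$

$$+ C\sum_{j\le J_n}B^j\sup_k|\beta_{jk;s}|\,\mathbb{E}\left[\mathbb{I}_{\{|\hat\beta_{jk;s}-\beta_{jk;s}|\ge\kappa t_n\}}\right] + C\sum_{j\le J_n}B^j\sup_k|\beta_{jk;s}|\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} =: Aa + Au + Ua + Uu.$$

Now as before, we note that it is possible to choose

$$J_{1n}:\ B^{J_{1n}} \sim \left\{\frac{n}{\log n}\right\}^{\frac{1}{2(r+1)}}, \quad\text{and for } j > J_{1n},\ \mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} = 0.$$

Hence, by (18),

$$Aa \le C\sum_{j\le J_{1n}}B^j\,\mathbb{E}\left[\sup_k\left|\hat\beta_{jk;s}-\beta_{jk;s}\right|\right] \le C\,J_{1n}\,n^{-1/2}B^{J_{1n}} \le C\,J_{1n}\,(\log n)^{-1/2}\left\{\frac{n}{\log n}\right\}^{-\frac{r}{2(r+1)}}.$$

Also

$$Uu = C\sum_{j\le J_n}B^j\sup_k|\beta_{jk;s}|\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} \le C\left\{\sum_{j\le J_{1n}}B^jt_n + \sum_{j>J_{1n}}B^j\sup_k|\beta_{jk;s}|\right\} \le C\left\{t_nB^{J_{1n}}+B^{-rJ_{1n}}\right\} \le C\left\{\frac{n}{\log n}\right\}^{-\frac{r}{2(r+1)}}.$$

For the remaining two terms the argument is the same, actually easier. For general $\pi$ and q,
it is sufficient to note that $B^r_{\pi q;s} \subset B^{r'}_{\infty\infty;s}$, $r' = r - 2/\pi$. By the previous argument

$$\mathbb{E}\left\|F^*_s-F_s\right\|_{L^\infty_s(\mathbb{S}^2)} \le C\,J_n\left\{\frac{n}{\log n}\right\}^{-\frac{r'}{2(r'+1)}} = C\,J_n\left\{\frac{n}{\log n}\right\}^{-\frac{r-2/\pi}{2(r-2(1/\pi-1/2))}}.$$

Note that for $\pi = p = \infty$ the sparse and regular zones coincide; otherwise for $p = \infty$ we are
always in the sparse zone.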



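• The sparse case

The argument is very much the same as before. Indeed we have $B^r_{\pi q;s} \subset B^{r-2(\frac{1}{\pi}-\frac{1}{p})}_{pq;s}$,

$$\mathbb{E}\left\|F^*_s-F_s\right\|^p_{L^p_s(\mathbb{S}^2)} \le 2^{p-1}\left\{\mathbb{E}\left\|\sum_{j\le J_n}\sum_k\left(w_{jk}\hat\beta_{jk;s}-\beta_{jk;s}\right)\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)} + \left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|^p_{L^p_s(\mathbb{S}^2)}\right\},$$

$$\left\|\sum_{j>J_n}\sum_k\beta_{jk;s}\psi_{jk;s}\right\|_{L^p_s(\mathbb{S}^2)} \le C\,B^{-J_n\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)} \le C\,B^{-2J_n\left[\frac{r-2(\frac{1}{\pi}-\frac{1}{p})}{2(r-2(\frac{1}{\pi}-\frac{1}{2}))}\right]} \le C\left\{\frac{n}{\log n}\right\}^{-\left[\frac{r-2(\frac{1}{\pi}-\frac{1}{p})}{2(r-2(\frac{1}{\pi}-\frac{1}{2}))}\right]},$$

because $r - \frac{2}{\pi} + 1 > 1$, given that $r - \frac{2}{\pi} > 0$ by assumption. Hence the bias term has the correct
order. For the variance term, the trick is very much as above, and we omit some details. It is
possible to split the term to be bounded into four terms, after which the two "cross terms"
Au and Ua are easy because they involve quantities like $P\left\{|\hat\beta_{jk;s}-\beta_{jk;s}| > \kappa t_n\right\}$, which can
be made smaller than $n^{-\rho/2}$ for all $\rho > 0$, given a suitable choice of $\kappa$. Fix $J_{2n}$ such that

$$B^{J_{2n}} = \left\{\frac{n}{\log n}\right\}^{\frac{1}{2((r-\frac{2}{\pi})+1)}}$$

so that

$$\left\{\frac{n}{\log n}\right\}^{\frac{\pi-p}{2}}B^{J_{2n}(p-\pi(r+1))} = \left\{\frac{n}{\log n}\right\}^{\frac{\pi-p}{2}}\left\{\frac{n}{\log n}\right\}^{\frac{p-\pi(r+1)}{2((r-\frac{2}{\pi})+1)}} = \left\{\frac{n}{\log n}\right\}^{\frac{(\pi-p)((r-\frac{2}{\pi})+1)+(p-\pi(r+1))}{2((r-\frac{2}{\pi})+1)}}.$$

For the terms of the form Aa and Uu we have

$$C\,n^{-p/2}\sum_{j\le J_{2n}}B^{j(p-2)}\sum_k\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} + C\sum_jB^{j(p-2)}\sum_k|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}},$$

where to obtain the first summand we have exploited the embedding $B^r_{\pi q;s} \subset B^{r-\frac{2}{\pi}}_{\infty\infty;s}$, whence
for $j > J_{2n}$ one has $\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} = 0$. Now

$$n^{-p/2}\sum_{j\le J_{2n}}B^{j(p-2)}\sum_k\mathbb{I}_{\{|\beta_{jk;s}|\ge\kappa t_n/2\}} \le C\,n^{-p/2}t_n^{-\pi}\sum_{j\le J_{2n}}B^{j(p-\pi)}B^{-\pi rj} \le C\left\{\frac{n}{\log n}\right\}^{\frac{\pi-p}{2}}B^{J_{2n}(p-\pi(r+1))}.$$

Likewise

$$\sum_{j\le J_{2n}}B^{j(p-2)}\sum_k|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} \le C\,t_n^{p-\pi}\sum_{j\le J_{2n}}B^{j(p-\pi)}\sum_kB^{j(\pi-2)}|\beta_{jk;s}|^\pi \le C\left\{\frac{n}{\log n}\right\}^{\frac{\pi-p}{2}}B^{J_{2n}(p-\pi(r+1))}.$$

Now

$$\frac{(\pi-p)((r-\frac{2}{\pi})+1)+(p-\pi(r+1))}{2((r-\frac{2}{\pi})+1)} = \frac{\pi(r+1)-2-pr+\frac{2p}{\pi}-p+(p-\pi(r+1))}{2((r-\frac{2}{\pi})+1)} = -\frac{2+pr-\frac{2p}{\pi}}{2((r-\frac{2}{\pi})+1)} = -\frac{p\left(r-2\left(\frac{1}{\pi}-\frac{1}{p}\right)\right)}{2\left(r-2\left(\frac{1}{\pi}-\frac{1}{2}\right)\right)},$$

that is, these terms have the right order. So we are only left with

$$\sum_{j>J_{2n}}B^{j(p-2)}\sum_k|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}}.$$

Consider

$$m = \frac{p-2}{r-\frac{2}{\pi}+1};$$

note that

$$p-m = \frac{pr-\frac{2p}{\pi}+p-p+2}{r-\frac{2}{\pi}+1} = \frac{pr-\frac{2p}{\pi}+2}{r-\frac{2}{\pi}+1} > 0, \qquad m-\pi = \frac{p-2}{r-\frac{2}{\pi}+1}-\pi = \frac{p-\pi(r+1)}{r-\frac{2}{\pi}+1} > 0,$$

because $p - \pi(r+1) > 0$ in the sparse zone. We have

$$\sum_{j>J_{2n}}B^{j(p-2)}\sum_k|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} \le C\,t_n^{p-m}\sum_{j>J_{2n}}B^{j(p-2)}\sum_k|\beta_{jk;s}|^m = C\,t_n^{p-m}\sum_{j>J_{2n}}B^{j(p-m)}\sum_kB^{j(m-2)}|\beta_{jk;s}|^m.$$

Now, because $B^r_{\pi q;s} \subset B^{r-\frac{2}{\pi}+\frac{2}{m}}_{mq;s}$,

$$\sum_kB^{j(m-2)}|\beta_{jk;s}|^m \le C\,B^{-mj\left(r-\frac{2}{\pi}+\frac{2}{m}\right)},$$

hence the left-hand side is bounded by

$$C\,t_n^{p-m}\sum_{J_{2n}<j\le J_n}B^{j\left[(p-m)-\left(r-\frac{2}{\pi}+\frac{2}{m}\right)m\right]}.$$

Observe that

$$(p-m)-\left(r-\frac{2}{\pi}+\frac{2}{m}\right)m = \frac{pr-\frac{2p}{\pi}+2}{r-\frac{2}{\pi}+1}-\left(r-\frac{2}{\pi}\right)m-2 = \frac{pr-\frac{2p}{\pi}+2}{r-\frac{2}{\pi}+1}-\left(r-\frac{2}{\pi}\right)\frac{p-2}{r-\frac{2}{\pi}+1}-2 = \frac{2r+2\left(1-\frac{2}{\pi}\right)}{r-\frac{2}{\pi}+1}-2 = 0,$$

hence

$$\sum_{J_{2n}<j\le J_n}B^{j(p-2)}\sum_k|\beta_{jk;s}|^p\,\mathbb{I}_{\{|\beta_{jk;s}|<2\kappa t_n\}} \le C\,J_n\,t_n^{p-m} \le C\log n\left\{\frac{n}{\log n}\right\}^{-\frac{p-m}{2}}.$$

Thus the proof is completed.

□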
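
The algebra behind the auxiliary index m used in the sparse case can be checked in exact arithmetic. The Python sketch below does so with rational numbers for one illustrative choice of (r, π, p) in the sparse zone; the function name `sparse_identities` and the sample values are hypothetical, and the formulas are those read off the displays above.

```python
from fractions import Fraction

def sparse_identities(r, pi_, p):
    """Return m, the exponent balance, (p - m)/2 and the sparse-zone rate exponent."""
    u = r - Fraction(2) / pi_                       # u = r - 2/pi, assumed positive
    m = (p - 2) / (u + 1)                           # auxiliary index m
    balance = (p - m) - (u + Fraction(2) / m) * m   # should vanish identically
    alpha_sparse = (p * (r - 2 * (Fraction(1) / pi_ - Fraction(1) / p))
                    / (2 * (r - 2 * (Fraction(1) / pi_ - Fraction(1, 2)))))
    return m, balance, (p - m) / 2, alpha_sparse

# Illustrative sparse-zone example: p > pi (r + 1) and r > 2/pi.
r, pi_, p = 2, 2, 8
m, balance, half_gap, alpha_sparse = sparse_identities(r, pi_, p)
```

The vanishing balance is what turns the sum over the levels between $J_{2n}$ and $J_n$ into a factor of order $J_n$, and the equality between (p - m)/2 and the sparse exponent shows that the remaining power of $t_n$ has the right order.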
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